I like to save all of the code that I use for my Mobius and assessment quizzes so I have good examples to follow later when doing the analysis. I’m partial to .rmd files (at least for the Mobius Skills Quizzes) because I can easily organize different sections, add notes/explanations, and put in LaTeX equations. If you prefer R scripts, then just try to keep them organized in a new folder where you’ll easily be able to find them again. But the most important thing is to just save your code somewhere for future reference. If you do everything in the console, that code will be really difficult to get back after a few days.

How to make effective notes

What is a good note, and what is a bad note?

A bad note:

\[ \text{-start note-} \]

Sneaky geom_bar()s

If you have a geom_bar() of counts, but you want proportions, try a position = “fill”:

geom_bar(position = fill)

Or, geom_bar with stat = “identity”

\[ \text{-end note-} \]

What’s bad about this? Why is it not effective?

This note is incomplete in many ways! First of all, it doesn’t do a very good job of explaining the context of its usefulness. In other words, when will this information be useful? Furthermore, this note doesn’t even do a good job of explaining how to do the thing its talking about. I was clearly in a hurry when I wrote it, because the code wouldn’t even run as is. Furthermore, it isn’t complete enough for me to reproduce; if I wanted to actually do what it’s talking about I’d still be lost. Worst of all, it gives 2 options (position = “fill” and stat = “identity”) but never explains the difference or which is preferred!

How might I make this note more effective?

A good note:

\[ \text{-start note-} \]

Things to Group By

One of ggplot’s greatest features is the capacity to easily group data according to variables. This is most commonly used to assign different colors:

Color and Fill

In a GGPlot object, the plot type determines which variables apply to each plot. Perhaps the most critical are color and fill. Depending on the plot, some have color, some have fill, and some have both…but one of them is the color for the edge of the shapes. Look at the GGPlot Cheat Sheet(s) for help on which is which. Also note that if you want to manually set a specific color, you must specify the argument in the ggplot object or layer call but NOT inside the aes(), as this will attempt to use the color name given as a data frame variable to sort the data by. The same is true for all aesthetics you wish to set manually.

Pottery_c <- Pottery %>% dplyr::filter(Site != "Caldicot") #removing Caldicot due to small sample
p <- ggplot(Pottery_c) + labs(title = "Chemical Composition of Ancient UK Pottery")
#
#geom_boxplot uses FILL
p + geom_boxplot(aes(Al, fill = Site), outlier.shape = NA)

#But geom_point (scatterplots) uses COLOR
p+ geom_point(aes(Al,Fe, color = Site))
ggplot(GSS_clean, aes(spanks, fill = has_rel)) + #Create plot object with assigned fill
  geom_bar(position = "dodge") + #Add a Bar Plot to the object to visualize
  theme_solarized() + #Theme for quick change of overall aesthetic
  scale_fill_solarized() + #Colorize plot to match theme
  scale_x_discrete(labels = c("Strongly Oppose", "Oppose", #Change the item labels on the x axis
                              "Approve", "Strongly Approve")) + 
  labs(title = "Are Christians more likely to spank their children?",  #change title
       x = "Spanking Preference",  #change x axis title
       y ="Count") + #change y axis title
  guides(fill = guide_legend(title = "Religious Preference")) #change Legend Title
Other aesthetics to group by

Color/fill are fine, but also consider alpha, size, and shape, all spelled that way. If using geom_smooth, try linetype. Or, set a boolean with a true false statement in the aesthetic mapping. Lastly, if for insane reason you want to group variables but not change their aesthetics in any way (possibly useful maybe in geom_smooth()), use group = as the aesthetic argument.

See R For Data Science, Chapter 3 for more.

\[ \text{-end note-} \]

What makes this note effective? What does it convey?

This note takes something that I need to do frequently, assign colors in a ggplot, and explains the context and workings of the necessary code. It clarifies a common misconception by disambiguating the times we want to specify color in an aes() argument instead of directly in the geom_* object, and why.

Further, this note give detailed example code, which in its original location actually runs. The output and the code are visible to the interested reader! Finally, it expands on the topic and gives information about where I can turn for more help on the topic.

What else could I add to improve the efficacy of this note?

When do I put something in my notes?

We should avoid the extra work of overcrowding, but also provide enough detail. How do we meet that balance? I follow this rule:

When I’m not 100% sure I’m going to remember it, and there’s a good chance I’ll need it again, I put it in my notes!

How do I connect concepts in my notes?

I’m awful at organization, so I just put everything in one big long document with carefully named titles. The important part is that I create a searchable document that speaks to me. I’m not writing these notes for anyone else, so if I think I’ll remember “how to do that one stinky function that does the thing with the wide data” better than “how to use pivot_longer()”, I put the thing I’ll remember as the title and then I know what to search later. Links are helpful to connect concepts together, but the most effective way to learn how topics relate is to learn more, write more, and do more statistics!

Add another tab in the .rmd files under the “Making Inference” tab

As it shows in the notebook under R Help/R Markdown Hints, you can make tabs using {.tabset}. All of the tests under the “Making Inference” tab uses these tabsets to go through Overview, R Instructions, and Explanation. You can, of course, easily add your own comments and notes throughout the rmd file (which is also a good idea), but if you want to have them all in one spot, you can add another tab to these sections!

In this example, I’m going to add another tab to the t Test section to keep all of my notes about it organized in one spot.


Find the End of the {.tabset}

If you don’t care about nice formatting and just want the information, then adding another tab is as simple as add the correct amount of ‘#’ after the </div> and before the ----


If you do care about the nice formatting, then you can just add the same code that the other tabs have at the beginning and end. <div style="padding-left:125px;"> </div>




No matter how you choose to add your notes, be sure to knit the Rmd file afterwards to update the html file.